Community consensus on core open science practices to monitor in biomedicine

The state of open science needs to be monitored to track changes over time and identify areas to create interventions to drive improvements. In order to monitor open science practices, they first need to be well defined and operationalized. To reach consensus on what open science practices to monitor at biomedical research institutions, we conducted a modified 3-round Delphi study. Participants were research administrators, researchers, specialists in dedicated open science roles, and librarians. In rounds 1 and 2, participants completed an online survey evaluating a set of potential open science practices, and for round 3, we hosted two half-day virtual meetings to discuss and vote on items that had not reached consensus. Ultimately, participants reached consensus on 19 open science practices. This core set of open science practices will form the foundation for institutional dashboards and may also be of value for the development of policy, education, and interventions.


Introduction
In November 2021, UNESCO adopted its Recommendation on Open Science, defining open science "as an inclusive construct that combines various movements and practices aiming to make multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation, evaluation and communication to societal actors beyond the traditional scientific community" [1]. UNESCO recommends that its 193 member states take action towards achieving open science globally. The recommendation emphasizes the importance of monitoring policies and practices in achieving this goal [1]. Open science provides a means to improve the quality and reproducibility of research [2,3], and a mechanism to foster innovation and discovery [4,5]. The UNESCO Recommendation has cemented open science's position as a global science policy priority. It follows other initiatives from major research funders, such as the Open Research Funders Group, as well as national efforts to implement open science via federal open science plans [6,7].
Despite these commitments from policymakers and funders, adopting and implementing open science has not been straightforward. There remains debate about how to motivate and incentivize individual researchers to adopt open science practices [8][9][10], and how to best track open science practices within the community. A key concern is the need for funding to cover the additional fees and time costs needed to adhere to some open science best practices, when the academic reward system and career advancement still incentivize traditional, closed research practices. What "counts" in the tenure process is typically the outwardly observable number of publications in prestigious (typically high impact factor and often paywalled) journals, rather than efforts towards making research more accessible, shareable, transparent, and reusable. Monitoring open science practices is essential if the research community intends to evaluate the impact of policies and other interventions to drive improvements, and understand the current adoption of open science practices in a research community. To improve their open science practices, institutions need to measure their performance; however, there is presently no effective system for efficient and large-scale monitoring without significant effort.
Consider the example of open access publishing. A researcher-led large analysis of researchers' compliance with funder mandates for open access publishing showed that the rate of adherence varied considerably by funder [11]. In Canada, the Canadian Institutes of Health Research (CIHR) had an open access requirement for depositing articles between 2008 and 2015. This deposit requirement was modified when CIHR and the other two major Canadian funding agencies harmonized their policies. The result was a drop in openly available CIHR-funded research from approximately 60% in 2014 to approximately 40% in 2017 [11]. In the absence of monitoring, it is not possible to evaluate the impact of introducing a new policy or to measure how other changes in the scholarly landscape impact open science practices.
The Coronavirus Disease 2019 (COVID-19) pandemic has created increased impetus for, and attention to, open science, which has contributed to the development of new discipline-specific practices for openness [12][13][14]. The current project aimed to establish a core set of open science practices within biomedicine to implement and monitor at the institutional level (Box 1). Our vision to establish a core set of open science practices stems from the work of Core Outcome Measures in Effectiveness Trials (COMET) [15]. If trialists agree on a few core outcomes to assess across trials, it strengthens the totality of evidence, enables more meaningful use in systematic reviews, promotes meta-research, and may subsequently reduce waste in research. We sought to apply this concept of community-agreed standardization to open science specifically in biomedical research, which currently lacks consensus on best practices, and work to operationalize different open science practices.

Box 1. Summary of key points
• Funders and other stakeholders in the international research ecosystem are increasingly introducing mandates and guidelines to encourage open science practices.
• Research institutions cannot currently monitor compliance with open science practices without engaging in time-consuming manual processes that many lack the expertise to undertake.
• We conducted an international Delphi study to agree which open science practices would be valuable for research institutions to monitor, with a view to creating an automated dashboard to support monitoring.
• We report 19 open science practices that reached consensus for institutional monitoring in an open science dashboard and describe how we intend to implement these.
• The open science practices identified may be of broader value for developing policy, education, and interventions.
The core set of open science practices identified here will serve the community in many ways, including in developing policy, education, or other interventions to support the implementation of these practices. Most immediately, the practices can inform the development of an automated open science dashboard that can be deployed by biomedical institutions to efficiently monitor adoption of (and provide feedback on) these practices. By establishing what should be reported in an institutional open science dashboard through a consensus building process with relevant stakeholders, we aim to ensure the tool is appropriate to the needs of the community.

Ethics statement
This study received ethical approval from the Ottawa Health Science Network Research Ethics Board (20210515-01H). Participants were presented with an online consent form prior to viewing round 1 of the Delphi; their completion of the survey was considered implied consent.
For complete study methods, please see S1 Text. We conducted a 3-round modified Delphi survey study. Delphi studies structure communication between participants to establish consensus [16]. Typically, Delphi studies use several rounds of surveys in which participants, experts in the topic area, vote on specific issues. Between rounds, votes are aggregated and anonymized and then presented back to participants along with their own individual scores, and feedback on others' anonymized voting decisions [17,18]. This gives participants the opportunity to consider the group's thoughts and to compare and adjust their own assessment in the next round. A strength of this method of communication is that it allows all individuals in a group to communicate their views. Anonymous voting also limits direct confrontation among individuals and the influence of power dynamics and hierarchies on the group's decision.
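The between-round feedback step described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the vote labels, and the 80% cutoff are assumptions made for illustration only, since the text above does not state the consensus threshold.

```python
from collections import Counter

# Hypothetical consensus cutoff; the text above does not state one.
ASSUMED_THRESHOLD = 0.80


def summarize_item(votes):
    """Aggregate votes for one item without exposing who voted how.

    `votes` maps a participant ID to a vote label (e.g., "include").
    Returns each option's share and the option that met the cutoff, if any.
    """
    counts = Counter(votes.values())
    total = sum(counts.values())
    if total == 0:
        return {"shares": {}, "consensus": None}
    shares = {option: n / total for option, n in counts.items()}
    consensus = next(
        (option for option, share in shares.items() if share >= ASSUMED_THRESHOLD),
        None,
    )
    return {"shares": shares, "consensus": consensus}


def feedback_for(participant_id, votes):
    """Pair one participant's own vote with the anonymized group summary."""
    summary = summarize_item(votes)
    return {"own_vote": votes[participant_id], "group_shares": summary["shares"]}


# Illustrative votes only; these are not data from the study.
votes = {"p1": "include", "p2": "include", "p3": "include",
         "p4": "include", "p5": "exclude"}
result = summarize_item(votes)
p5_feedback = feedback_for("p5", votes)
```

Each participant sees only their own vote next to group-level shares, which mirrors the anonymity property the Delphi method relies on.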
Participants in our Delphi were from a convenience sample obtained through snowball sampling of academic institutions interested in open science. The individuals from the institutions represented any/all of the following groups:
1. Library or scholarly communication staff (e.g., responsible for purchasing journal content, responsible for facilitating data sharing or management).
3. Staff involved in researcher assessment (e.g., appointment and tenure committee members).
4. Individuals involved in institutional metrics assessment or reporting (e.g., performance management roles).
Because titles and roles differ from institution to institution, we left it to the discretion of the institution to identify participants. Broadly, we aimed to include people who either knew about scholarly metrics or made decisions regarding researcher assessment or hiring. We also explicitly encouraged the institutions to consider diversity of their representing participants (including gender and race) when inviting people to contribute. However, there are a variety of stakeholders that may influence institutional monitoring of open science practices. A limitation of the current work is that we included exclusively participants directly employed by academic institutions. While our intention is to implement the proposed dashboard at biomedical institutions, it is possible we missed nuance or richness, for example, by failing to include representatives from scholarly publishers, academic societies, or funding agencies.
The first two rounds of the Delphi were online surveys administered using Surveylet. Surveylet is a purpose-built platform for developing and administering Delphi surveys [19]. To start with, the Delphi participants were presented with an initial set of 17 potential open science practices to consider that were generated by the project team based on a discussion. Round 3 took the form of two half-day meetings hosted on Zoom [20]. Hosting round 3 in the form of an online meeting is a modification of the traditional Delphi approach. This was done to provide an opportunity for more nuanced discussion among participants about the potential open science practices while still retaining anonymized online voting. We opted for a virtual meeting given the COVID-19 pandemic restrictions at the time and the cost effectiveness for enabling international participation. However, while our use of a modified Delphi in which round 3 took place online provided the opportunity for more nuanced discussion prior to voting, it also meant that we ultimately reduced the overall number of participants taking part in that round in order to host a manageable sized group for the online meeting. This methodological approach may have reduced some of the diversity in potential response despite providing greater richness in responses.
While the structured, anonymous, and democratic approach of the Delphi process offers many advantages to reaching consensus, it is not without limitations. The methods used here may have impacted our outcome. For example, the use of a forced choice item rather than a scale in rounds 2 and 3 may have contributed to a greater likelihood for items to reach consensus in these rounds. While we endeavored to attract a diverse and representative sample of institutions to contribute, ultimately given our sampling approach, it is likely that the participants and institutions that agreed to take part may not be as representative of the global biomedical research culture as we desired, and may have a stronger interest in or commitment to open science than is typical. While the sample may not be generalizable, these institutions likely represent early adopters or willing leaders in open science. Further, our Delphi surveys and consensus meetings were conducted in English only, and the meeting was not conducive for attendance across all time zones. These factors will have created barriers to participation for some institutions or participants. Defining who is an "expert" to provide their views in any Delphi exercise provides an inherent challenge [21]. We faced this challenge here, especially considering the diversity of open science practices and the nuances of applying these practices in distinct biomedical subdisciplines. For example, our vision to create a single biomedical dashboard to deploy at the institutional level may mean we have missed nuances in open science practices in preclinical as compared to clinical research.

Round 1
Participants: We excluded participants who did not complete 80% or more of the survey in this round. A total of 80 participants from 20 institutions in 13 countries completed round 1. Full demographics are described in Table 1. A total of 44 (55.0%) participants identified as men, 35 (43.8%) as women, and 1 (1.3%) as another gender. Of the 32 research institutions that were invited to contribute to the study, 20 (62.5%) ended up contributing, and 1 to 7 participants from each organization responded to our survey. Researchers (N = 31, 38.8%) and research administrators (N = 18, 22.5%) comprised most of the sample.
Voting: Of the 17 potential core open science practices presented in round 1, two reached consensus. Participants agreed that "registering clinical trials on a registry prior to recruitment" and "reporting author conflicts of interest in published articles" were essential to include. See full results in Table 2.
Participants suggested 10 novel potential core open science practices to include in round 2 for voting; they were as follows: use of Research Resource Identifiers (RRIDs) where relevant biological resources are used in a study; inclusion of funder statements; information on whether a published paper has open peer reviews available (definitions vary for open peer review [22], but we define this as having transparent peer reviews available); sharing a data management plan; use of open licenses when sharing data/code/materials; use of nonproprietary software when sharing data/code/materials; use of persistent identifiers when sharing data/code/materials; sharing research workflows in computational environments; reporting on the gender composition of the authorship team; and reporting results of trials in a manuscript-style publication (peer reviewed or preprint) within 2 years of study completion.

Round 2
Participants: Fifty-six (70% of round 1) participants completed the round 2 survey (see Table 1). Of the 20 research institutions that completed round 1, 19 (95%) institutions continued their contributions in round 2, with up to 5 participants from each organization responding to our survey. Researchers (N = 23, 41.1%) and research administrators (N = 11, 19.6%) again comprised most of the sample, as in round 1.
Voting: Of the 15 potential core open science practices that participants had not reached consensus on in round 1, 6 reached consensus in round 2. Participants agreed that the following practices were essential to reporting in the dashboard: whether data were shared openly at the time of publication (with limited exceptions); whether code was shared openly at the time of publication (with limited exceptions); whether reporting guideline checklists were used; whether author contributions were described; whether ORCID identifiers were used; and whether registered clinical trials were reported in the registry within 2 years of study completion. Participants then ranked the 10 novel potential core open science practices suggested by participants in round 1 for the first time. None of these 10 new practices reached consensus in round 2. There were no other explicitly described practices suggested by participants in round 2 to consider for the dashboard in round 3.

Round 3
Participants: Twenty-one participants were present on day 1 and 17 on day 2 of the consensus meeting. Full demographics are described in Table 1. One participant on each day did not provide any demographic information.
Voting: There were 19 items that had not reached consensus in round 2. After discussing each item, some were reworded slightly, expanded into two items, or collapsed into a single item (see notes on modifications made in Table 2). Ultimately, participants voted on 22 potential open science practices in round 3. One of these items asked participants to vote on "reporting whether registered clinical trials were reported in the registry within 1 year of study completion." An item describing "reporting that registered clinical trials were reported in the registry within 2 years of study completion" reached consensus in round 2; however, several participants commented that the timeframe was inconsistent with requirements of funders that have signed the World Health Organization joint statement on public disclosure of results from clinical trials, which specified 12 months. Based on this, participants were asked to revote on this item using the 1-year cutoff.
Of the 22 potential items voted on in round 3, 12 reached consensus for inclusion: whether systematic reviews have been registered; whether there was a statement about study materials sharing with publications; the use of persistent identifiers when sharing data/code/materials; whether data/code/materials are shared with a clear license; whether the data/code/materials license is open or not; citations to data; what proportion of articles are published open access with a breakdown of time delay; the number of preprints; that registered clinical trials were reported in the registry within 1 year of study completion; trial results in a manuscript-style publication (peer reviewed or preprint); systematic review results in a manuscript-style publication (peer reviewed or preprint); and whether research articles include funding statements. One item reached consensus for exclusion from the dashboard: reporting whether workflows in computational environments were shared. Participants agreed this item should be a component of the existing item, "reporting whether code was shared openly at the time of publication (with limited exceptions)." Participants discussed how some of the items that reached consensus for inclusion represented essential practices more broadly related to transparency or reporting than practices generally considered traditional open science procedures. Following round 3, items that reached consensus were grouped based on these broad categories (traditional open science versus broader transparency practices for reporting) and participants were asked to rank the practices based on how they should be prioritized for programming for inclusion in our proposed dashboard (Table 3). Items with higher scores represent those that were given a higher priority.
The top two traditional open science practices by priority were reporting whether clinical trials were registered before they started recruitment, and reporting whether study data were shared openly at the time of publication (with limited exceptions). The top two broader transparency practices by priority were reporting whether author contributions were described, and reporting whether author conflicts of interest were described.
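The item flow across the three rounds can be checked with simple arithmetic. The sketch below uses only the counts reported above; the variable names are illustrative, and it assumes that the 1-year trial-reporting item agreed in round 3 replaces the 2-year version agreed in round 2, so the two are counted once.

```python
# Counts taken from the round-by-round results reported above.
initial_items = 17
round1_consensus = 2
round1_new_suggestions = 10
round2_consensus = 6
round3_included = 12

# Items from the initial set still unresolved after round 1.
carried_to_round2 = initial_items - round1_consensus

# Unresolved initial items after round 2, plus the new suggestions,
# none of which reached consensus in round 2.
unresolved_after_round2 = (carried_to_round2 - round2_consensus) + round1_new_suggestions

# Assumption: the 1-year trial-reporting item (round 3) supersedes the
# 2-year item (round 2), so one item is subtracted to avoid double counting.
superseded_items = 1
final_core_set = (
    round1_consensus + round2_consensus + round3_included - superseded_items
)

print(carried_to_round2, unresolved_after_round2, final_core_set)
```

Under that reading, the tally reproduces the 15 items carried into round 2, the 19 items unresolved after round 2, and the final core set of 19 practices.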

Consensus core open science practices
Below we briefly consider each of the core practices that reached consensus and discuss the process of implementing each. A total of 19 practices reached consensus for inclusion in the dashboard.
Traditional open science practices
1. Reporting whether clinical trials were registered before they started recruitment. This practice is required by several organizations and funders internationally. Despite clear mandates for registration, we know adherence to this practice is not optimal [23]. Standardized reporting of trial registration will allow for linkage of trial outputs to the registry and help contribute to the reduction of selective outcome reporting and non-reporting.
2. Reporting whether study data were shared openly at the time of publication (with limited exceptions). Policies encouraging and mandating open data are growing. This practice considers whether there is a statement about open data in a publication. It does not require that this statement indicate that data are in fact publicly available. As culture around data sharing becomes more normative, it may be of value to reevaluate whether tracking the proportion of openly available data is of value. To do so effectively will require changes in the culture around and use of DOIs. Information on the data available and its usability would be essential to provide quality control and for an individual to determine not just if data can be used, but whether it should be used for the intended purpose. Exceptions would include nonempirical pieces (e.g., a study protocol).
4. Reporting whether study code was shared openly at the time of publication (with limited exceptions). Similar to practice 2, this practice considers whether there is a statement about open code sharing in the publication. It does not require that this statement indicate that code is in fact publicly available. As culture around code sharing becomes more normative, information about the quality and type of code shared and compliance with best practices (e.g., FAIR principles) may be valuable to monitor. Exceptions would include nonempirical pieces.
5. Reporting whether systematic reviews have been registered. This practice is required by some journals and is common within knowledge synthesis projects. Standardized reporting of systematic review registration will allow for linkage of review outputs to the registry and help reduce unnecessary duplication in reviews.
6. Reporting that registered clinical trials were reported in the registry within 1 year of study completion. The practice of reporting trial results in the registry they were first registered in is required by several organizations and funders. This practice would track the proportion of trials in compliance with reporting results within 1 year of study completion.
7. Reporting whether there was a statement about study materials sharing with publications. This practice considers whether there is a statement about materials sharing with a publication. It does not consider whether or not materials are indeed shared openly. As with data and code sharing, materials sharing is not yet widespread across biomedicine. As a starting point, statements about materials sharing will be monitored, but in time, it may be of value to track the frequency of materials sharing at an institution. This could inform infrastructure needs.
8. Reporting whether study reporting guideline checklists were used. Reporting guidelines are checklists of essential information to include in a manuscript; these are widely endorsed by medical journals and have been shown to improve the quality of reporting of publications [24]. This item would track whether reporting guidelines were cited in a publication. In the future, tracking actual compliance with reporting guideline items may be more relevant.
9. Reporting citations to data. This practice monitors whether a given dataset shared from researchers at an institution has received citations in other works. This is a proxy for data reuse and may be a relevant metric to consider alongside others when considering study impact.
10. Reporting trial results in a manuscript-style publication (peer reviewed or preprint). This practice would report whether a trial registered on a trial registry had an associated manuscript-style publication within 1 year of study completion. This will include reporting in the form of preprints.
11. Reporting the number of preprints. This practice reports the frequency of preprints produced at the institution over a given timeframe.
12. Reporting systematic review results in a manuscript-style publication (peer reviewed or preprint). This practice would report whether a registered systematic review had an associated manuscript-style publication within 1 year of study completion. This will include reporting in the form of preprints.
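Several of the practices above (6, 10, and 12) reduce to a proportion of registered studies with an output inside a 1-year window. The following is a minimal sketch of how such a proportion might be computed, not the dashboard's actual method. The record fields (`completion_date`, `results_date`), the 365-day approximation of "1 year," and the choice to count only trials whose deadline has already passed are all assumptions made for illustration.

```python
from datetime import date, timedelta

# "Within 1 year" approximated as 365 days (an assumption).
REPORTING_WINDOW = timedelta(days=365)


def deadline(trial):
    """Latest date on which results still count as reported on time."""
    return trial["completion_date"] + REPORTING_WINDOW


def on_time_proportion(trials, as_of):
    """Share of due trials whose results were posted by their deadline.

    Trials whose deadline falls after `as_of` are not yet due and are
    left out of the denominator. Returns None when no trial is due.
    """
    due = [t for t in trials if deadline(t) <= as_of]
    if not due:
        return None
    on_time = [
        t for t in due
        if t["results_date"] is not None and t["results_date"] <= deadline(t)
    ]
    return len(on_time) / len(due)


# Illustrative records only; these are not real trials.
trials = [
    {"id": "T1", "completion_date": date(2020, 1, 15), "results_date": date(2020, 11, 1)},
    {"id": "T2", "completion_date": date(2020, 3, 1), "results_date": date(2021, 6, 30)},
    {"id": "T3", "completion_date": date(2020, 5, 10), "results_date": None},
    {"id": "T4", "completion_date": date(2023, 9, 1), "results_date": None},
]
proportion = on_time_proportion(trials, as_of=date(2024, 1, 1))
```

In this example T1 reports on time, T2 reports late, T3 never reports, and T4 is not yet due, so one of three due trials is compliant.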
Broader transparency practices
1. Reporting whether author contributions were reported. Journals are increasingly requiring or permitting authors to make statements (e.g., using the CRediT taxonomy) about their role in the publication. This helps to clarify the diversity of contributions each author has made. This practice would track the presence of these statements in publications. Monitoring the use of author contribution statements may help institutions to devise ways to recognize individuals' skills when hiring and promoting researchers.
2. Reporting whether author conflicts of interest were reported. Reporting of conflicts of interest is a standard practice at many journals, but this practice is not uniform, with some publications lacking statements altogether. Monitoring conflicts of interest reporting helps to ensure transparency. In the absence of a statement of conflicts of interest, the reader cannot assume none exist. For this reason, we reached consensus that all papers should have such a statement irrespective of whether conflicts exist.
3. Reporting the use of persistent identifiers when sharing data/code/materials. Persistent identifiers such as DOIs are digital codes for online objects that remain consistent over time. Use of persistent identifiers for research outputs such as data, code, and materials fosters collation and linkage.
4. Reporting whether ORCID identifiers were reported. ORCID identifiers are persistent researcher identifiers. This practice would track whether publications report these. Knowledge about use of ORCID will help inform iterations of our open science dashboard. While our dashboard will focus at the research institution level, ORCIDs may be relevant to use to collate institution publications, or to produce researcher-level outputs.
5. Reporting whether data/code/materials are shared with a clear license. This practice monitors whether licenses are used when research outputs like data, code, and materials are shared (e.g., use of Creative Commons licenses).
6. Reporting whether research articles include funding statements. Reporting on funding is a standard practice at many journals and required by some funders, but this practice is not uniform, with some publications lacking statements altogether. Monitoring funding statements helps to ensure transparency and provide linkage between funding and research outputs. For this reason, we reached consensus that all papers should have funder statements irrespective of whether funding was received. In the future, knowledge of what types of funding a publication received may foster meta-research on funding allocation and research outputs.
7. Reporting whether the data/code/materials license is open or not. Among research outputs shared with a license, this practice monitors the proportion of these that are "open" (i.e., publicly available with no restrictions to access when appropriate to the data).
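The two license-related practices above nest: the first asks what share of outputs carry any license, and the second asks what share of the licensed outputs are open. A minimal sketch of that nesting follows. The record structure and the allow-list of "open" licenses are assumptions for illustration; the text above does not specify which licenses would count as open.

```python
# Hypothetical allow-list of SPDX-style identifiers treated as "open".
# Which licenses qualify is an assumption, not a decision from the study.
ASSUMED_OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "MIT", "Apache-2.0"}


def license_metrics(outputs):
    """Compute the two nested license proportions for shared outputs.

    `outputs` is a list of dicts with a "license" key (None if absent).
    Each proportion is None when its denominator is empty.
    """
    licensed = [o for o in outputs if o.get("license")]
    open_licensed = [o for o in licensed if o["license"] in ASSUMED_OPEN_LICENSES]
    return {
        # Share of all outputs carrying any license.
        "share_with_license": len(licensed) / len(outputs) if outputs else None,
        # Share of licensed outputs whose license is on the allow-list.
        "share_open_among_licensed": (
            len(open_licensed) / len(licensed) if licensed else None
        ),
    }


# Illustrative records only; these are not data from the study.
outputs = [
    {"type": "data", "license": "CC-BY-4.0"},
    {"type": "code", "license": "MIT"},
    {"type": "materials", "license": "CC-BY-NC-ND-4.0"},
    {"type": "data", "license": None},
]
metrics = license_metrics(outputs)
```

Keeping the denominators separate avoids penalizing an institution twice for the same unlicensed output.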

Future directions
The next phase of this research program will involve developing the open science dashboard interface and its programming. While we aim to create a fully automated tool, some core open science practices that reached consensus for inclusion in the dashboard may not lend themselves to reliable, automated analysis. For example, the fact that digital identifiers are not widely used on some research outputs (e.g., when sharing code or study materials) may create challenges in accurate measurement. If we find this to be the case, in these instances, we will exclude the open science practice from monitoring. We chose not to restrict the community of Delphi participants in terms of the ease of automation of what they wanted in the tool; we encouraged participants to "think big." Ultimately, some items may not be possible to include due to feasibility. We anticipate iterative consultation with the community as we work to develop a dashboard that best meets their needs. As infrastructure and the use of identifiers evolve within the biomedical community, there will be a need to refresh consensus and reconsider processes used to best automate the core open science practices. We anticipate that the open science dashboard will serve as a tool for institutions to track their progress in adopting the agreed open science practices, but also to assess their performance relative to existing mandates. For example, the dashboard will enable institutions to monitor their adherence to mandates related to open access publishing, clinical trial registration and reporting, and data sharing, all of which are commonly mandated by funders globally and related stakeholders in the research ecosystem [25][26][27]. We also anticipate that several of the open science practices included in the dashboard will not reflect practices that are widely performed or mandated. Some items may therefore reflect aspirational practices for the community. The dashboard can be used to benchmark for improvements in these areas.
The proposed dashboard is a necessary precursor for providing institutional feedback on the performance of the agreed open science practices. As we pilot implementation of the dashboard, we will consider how the tool can provide tailored feedback to individual institutions, or distinct settings. The central goal of the dashboard is not to facilitate comparison between institutions (i.e., where adherence to practices can be directly compared within the dashboard across different institutions). This type of ranking is counter to our community-driven initiative that seeks to provide a tool for institutional-level improvement in open science rather than to pit organizations, who often are situated quite differently, against one another. Our vision is that the tool will not develop to be punitive, competitive, or a prestige indicator, as this is likely to further contribute to the systematic enablement of high-resource institutions. Nonetheless, a core set of agreed practices is helpful for comparative meta-research around open science.
We intend for the dashboard to be implemented at the individual institution level. Understanding a given institution's setting, current norms, and resource circumstances will be critical to deciding how to best implement the dashboard in that environment. A key step in the program to develop the proposed dashboard will be to carefully consider the appropriateness of the dashboard being publicly available versus hosted internally by biomedical institutions. Preference is likely to vary across institutions based on their circumstances. As we implement the proposed open science dashboard, it will also be important to measure how nuances in language, geographic location, discipline, and other institutional differences impact optimal local adoption. Even subtle differences in understanding of, and experiences with, open science at different institutions may have an important impact on how an eventual dashboard can be implemented to best meet institutional needs while still retaining a core set of practices to monitor.
Over time, we will also need to monitor the dashboard itself. As open science becomes increasingly embedded in the research ecosystem, the core practices of today may differ from those of the future. During implementation, we will evaluate how the tool is impacted by subtleties and practical constraints differing between institutions, countries, and geographical regions (for example, how appropriate the tool is in a Global North versus Global South setting). Addressing the distinct challenges will help to foster harmonization in measuring open science practices in the biomedical community. We will need to monitor and stay abreast of the global community's needs and practices to ensure the dashboard is sustainable and relevant over time.

Conclusions
This Consensus View aimed to establish a core set of open science practices to monitor at biomedical institutions. The core set of practices was developed to inform items to track in a proposed automated open science dashboard, which could be deployed by institutions and report aggregated institutional-level information on performance for each practice included. A consensus list of open science practices may be of broad value when developing policy, education, and interventions to improve open science in biomedicine.
Through consulting with 80 stakeholders from 20 institutions, consensus was reached on the value of tracking 19 practices in the proposed dashboard. By taking the approach of consulting the community and building consensus on the practices to include in the dashboard, we intend to develop a dashboard that best meets the needs of the community. By bringing the community together prior to developing the tool, we have also had the opportunity to brainstorm and discuss implementation strategies. We now have a roadmap to guide how to obtain community feedback on a prototype of the dashboard and a plan to pilot implementation at 3 institutions. This pilot and implementation exercise will position us to better understand barriers and enablers to adoption and use of the proposed open science dashboard [28].